Modeling Task Uncertainty for Safe Meta-Imitation Learning
نویسندگان
چکیده
منابع مشابه
Safe end-to-end imitation learning for model predictive control
We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold...
متن کاملMultiagent Collaborative Task Learning through Imitation
Learning through imitation is a powerful approach for acquiring new behaviors. Imitation-based methods have been successfully applied to a wide range of single agent problems, consistently demonstrating faster learning rates compared to exploration-based approaches such as reinforcement learning. The potential for rapid behavior acquisition from human demonstration makes imitation a promising a...
متن کاملDropoutDAgger: A Bayesian Approach to Safe Imitation Learning
While imitation learning is becoming common practice in robotics, this approach often suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by continually aggregating training data from both the expert and novice policies, but does not consider the impact of safety. We present a probabilistic extension to DAgger, which uses the distribut...
متن کاملOne-Shot Visual Imitation Learning via Meta-Learning
In order for a robot to be a generalist that can perform a wide range of jobs, it must be able to acquire a wide variety of skills quickly and efficiently in complex unstructured environments. High-capacity models such as deep neural networks can enable a robot to represent complex skills, but learning each skill from scratch then becomes infeasible. In this work, we present a meta-imitation le...
متن کاملExploiting Model Uncertainty Estimates for Safe Dynamic Control Learning
Model learning combined with dynamic programming has been shown to be e ective for learning control of continuous state dynamic systems. The simplest method assumes the learned model is correct and applies dynamic programming to it, but many approximators provide uncertainty estimates on the t. How can they be exploited? This paper addresses the case where the system must be prevented from havi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Frontiers in Robotics and AI
سال: 2020
ISSN: 2296-9144
DOI: 10.3389/frobt.2020.606361